Pollinator Indicators

Authors
Affiliation

Kwaku Peprah Adjei

Norwegian institute for Nature Research

Anders Kolstad

Norwegian institute for Nature Research

Markus A.K. Sydenham

Norwegian institute for Nature Research

Published

August 19, 2024

Status

This indicator documentation is incomplete and needs further developement before indicator values can be calculated.

Recomended citation

Adjei, K. P, Kolstad, A. , Sydenham, M. Pollinator Indicators Pollinator Indicators v. 000.001 ecRxiv https://github.com/NINAnor/ecRxiv

Version

000.001

Table 1: Indicator metadata
Indicator ID NA
Indicator Name Pollinator Indicators
Country Norway
Continent Europe
Ecosystem Condition Typology Class NA
Realm Terrestrial (T)
Biome NA
Ecosystem NA
Year added 2024
Last update 2024
status incomplete
Version 000.001
Version comment First draft
url https://github.com/NINAnor/ecRxiv

Pollinator indicators





1. Introduction

We aim to model the diversity of pollinators in Norway given their know interaction with some plant species. From the estimated diversity across the nation, we intend to look at the estimates in the ASO Data meadows by using their values as the reference values. To achieve this, we do the following:

  • obtain data from GBIF for bumblebees, butterflies and bees

  • fit an integrated species distribution model (ISDM) to the data to obtain pollinator intensities

  • obtain a web plot of the pollinator and plant species

  • get data on the interaction plant species from GBIF

  • fit an ISDM to the plant species data to obtain the species occurrence probability

  • estimate the alpha diversity index of the pollinators across Norway

Figure 1: Flowchart of the modelling framework used in estimating the pollinator diversity indices. The modelling start with A) fitting ISDMs to the pollinator data, then B) fitting ISDMs to the plant data, C) constructing the web interactions between the plant and pollinators and D) estimating the alpha diversity indices. The dataset and covariates are enclosed in dashed green-colored rectangular regions.

2. About the underlying data

The pollinator and plant species data used to fit the ISDMs were obtained from GBIF. Cite the DOI of the downloaded data.

2.1 Spatial and temporal resolution

Both pollinator and plant data obtained was for the entire Norway observed within \(2005\) to \(2024\). For each dataset within the downloaded GBIF data, we formatted them based on their meta data into presence-only and presence-absence data. Then, we merge all the presence-absence datasets into one dataset: mergedDatasetPA and all the presence-only datasets into one dataset: mergedDatasetPO. The number of occurrences in each dataset for the pollinators are presented in Table 2.

Table 2: Number of occurrence records for each pollinator in the merged presence-absence and presence-only datasets
Dataset Pollinator No. of observations
Presence-only Bees 84485
Butterflies 474857
Hoverflies 79683
TOTAL 639025
Presence-absebce Bees 40429
Butterflies 39321
Hoverflies 51861
TOTAL 131611
Figure 2: Distribution of 54 plant species in the merged presence-absence dataset.
Figure 3: Distribution of 54 plant species in the merged presence-only dataset.
Table 3: Number of plant species occurrences from the merged presence-absence and presence-only datasets obtained from GBIF.
Plant species Presence-absence Presence-only
Ajuga pyramidalis 72650 4427
Astragalus alpinus 155978 2283
Campanula rotundifolia 218345 13558
Carum carvi 47681 2962
Filipendula ulmaria 163248 18103
Galium album 65106 3734
Galium aparine 45202 1926
Galium boreale 67332 8338
Galium elongatum 43791 1503
Galium palustre 84951 2598
Galium saxatile 5048 1619
Galium uliginosum 49728 2061
Galium verum 23600 3619
Geranium robertianum 49141 6148
Geranium sylvaticum 238108 15692
Gymnadenia conopsea 458 3659
Hieracium murorum 764 30
Hieracium umbellatum 9808 3985
Hieracium vulgatum 13483 39
Hypochaeris radicata 1988 1399
Knautia arvensis 50898 6609
Lathyrus linifolius 91357 6438
Lathyrus pratensis 77899 5923
Lathyrus vernus 56287 2228
Leucanthemum vulgare 125584 9992
Lotus corniculatus 64438 12485
Nardus stricta 47131 7179
Origanum vulgare 767 2484
Plantago lanceolata 8672 5638
Plantago major 6730 7200
Plantago media 1389 2442
Silene dioica 56046 7645
Silene vulgaris 33839 3742
Stellaria graminea 180580 8687
Stellaria media 47587 4101
Stellaria nemorum 62024 4250
Trifolium medium 35383 4670
Trifolium pratense 147575 12698
Trifolium repens 97671 11745
Trollius europaeus 44249 5928
Valeriana sambucifolia 68074 108
Veronica alpina 17738 626
Veronica chamaedrys 175299 9228
Veronica officinalis 223994 11936
Veronica serpyllifolia 5988 3153
Vicia cracca 83519 10301
Vicia sepium 112154 6270
Vicia sylvatica 45135 2214
Viola biflora 3518 3612
Viola canina 123557 3602
Viola epipsila 611 499
Viola palustris 83363 8928
Viola riviniana 425049 9602
Viola tricolor 88517 4075

Iintend to keep either the table or the figures in the final document, depending on what is needed for the report.

2.2 Original units

Is it presence-absence or presence only. Is it abundance or etc.

2.3 Additional comments about the dataset

3. Indicator properties

Anders will help with this

3.1. ECT

3.2. Ecosystem condition characteristic

3.3. Other standards

3.4. Collinearities with other indicators

4. Reference condition and values

4. 1. Reference condition

4. 2. Reference values

4.2.1 Minimum and maximum values

Reference values for diversity indices at the ASO data meadows.

ggplot(asoDataMeadows) +
  gg(asoDataMeadows["overallDiversity"], aes(col = overallDiversity)) +
scale_fill_gradientn(colours = terrain.colors(30))+
  geom_sf(data = regionGeometry, alpha = 0.1)

4.2.2. Threshold value for defining good ecological condition (if relevant)

4.2.3. Spatial resolution and validity

5. Uncertainties

6. References

7. Datasets

7.1 Dataset A

7.2. Dataset B

7.3. Covariates

We used several covariates to fit both the pollinator and plant ISDMs. The names of the covariates, their description and source are presented in Table 4. Additionally, we include the indication of whether the covariate is a habitat, trait or climatic variables as well as which of the ISDMs it was used to fit in Table 4.

Table 4: Description of covariates used to fit the integrated species distribution models and the source the covariates were retrieved from. The table indicated what type of covariate (habitat, trait and climatic) and which of the distributions it was used for (pollinator and/or plant) ISDM
Name Description Source Habitat/ Trait/ Climatic Disribution
bio1 Annual temperature geodata R-package Climatic Pollinator, Plant
cellID 1 km grid cell created from the bio1 raster NA Trait NA
Latitude Latitudinal gradient extracted from the bio1 raster NA Trait Pollinator
mBio1 Mean pollinator annual temperature per each pollinator species NA Trait Pollinator
sdBio1 Standard deviation of pollinator annual temperature per each pollinator species NA Trait Pollinator
mLat Mean pollinator latitude per each pollinator species NA Trait Pollinator
sdLat Standard deviation of pollinator latitude per each pollinator species NA Trait Pollinator
RegCom NA NA Trait Pollinator
landCover Land cover raster NA Habitat Pollinator, Plant
soilPh NA NA Habitat Plant
soilCoarseFraction NA NA Habitat Plant
soilOrganicCarbon NA NA Habitat Plant
soilMoisture NA NA Climatic Plant
bio1sq Square of the annual temperature NA Climatic Pollinator
bmInt Interraction between the annual temperature and mean pollinator annual temperature per each pollinator species NA Climatic Pollinator
bmIntsq square of interraction between the annual temperature and mean pollinator annual temperature per each pollinator species NA Climatic Polinator

We present the distribution of the covariates described in Table 4 here in Figure 4.

Figure 4: Covariates used to fit the ISDM.

8. Spatial units

9. Analyses

We fitted an ISDM using the point process framework [REF] to the merged pollinator presence-only (PO) and presence-absence (PA) datasets. ISDMs are various models that are linked together by a shared ecological process.

Using the poisson process framework, we model the presence-only data as a Poisson point process model [REF] with mean intensity \(\lambda_i(s)\) for each pollinator \(i \in \{bees, butterflies, hoverflies\}\). This mean intensity is modelled as: \[\begin{equation} \label{eq:int} \ln(\lambda_i(s)) = \beta_{0i} + \beta_{ji} X_j(s) + \omega_i(s) + \eta(s), \end{equation}\] where \(\beta_{0i}\) is the pollinator-specific intercept, \(\beta_{ji}\) is the pollinator-specific effect of covariate \(j\), \(X_j(s)\) is the \(j^{th}\) covariate field, \(\omega_i(s)\) is the pollinator-specific spatial autocorrelation field and \(\eta(s)\) is the bias field. \(\omega_i(s)\) and \(\eta(s)\) are modeled as zero mean Gaussian random field (i.e. \(\omega_i(s) \sim N(0, \Sigma)\), where \(\Sigma\) is a Mat’ern covariance matrix with variance \(\sigma_{\omega}^2\) and range \(\kappa\) and \(\eta(s) \sim N(0, \Sigma_\eta)\), where \(\Sigma_\eta\) is a Mat’ern covariance matrix with variance \(\sigma_{\eta}^2\) and range \(\kappa_\eta\)).

We model the PA with a logistic regression model. Let \(y_i(s)\) be a binary value at location \(s\) with \(0\) indicating absence of pollinator \(i\) and \(1\) indicating presence of pollinator \(i\). Then \(y_i(s) \sim \text{Bernoulli}(\Psi_i(s))\) with: \[ cloglog(\Psi_i(s)) = \beta_{0i} + \beta_{ji} X_j(s) + \omega_i(s) \] where the parameters are defined in equation \(\ref{eq:int}\).

All the ISDMs were fitted using the R-package PointedSDMs [REF].

9.1. ISDM for pollinators

Using the model framework defined above, we fitted the ISDM to the pollinator dataset obtained from GBIF.

model <- PointedSDMs::startSpecies(PointsBeesAndButterfliesAndHoverflies, # list of pollinator dataset (containing both mergedDatasetPA and mergedDatasetPO) 
                                   Boundary = regionGeometry, # boundary of the study
                                   Projection = newCrs, # projection
                                   Mesh = meshForProject, #mesh used for the model
                                  speciesEnvironment = TRUE, # Should we have pollinator specific covariate effect
                                  speciesIntercept = TRUE, # TRUE treats the intercept as a random effect, instead of constrained as with a fixed effect 
                                  pointsIntercept = FALSE, #should we include intercept for each dataset
                                  pointCovariates = FALSE, #do we have covariates for the presence-only data
                                   responsePA = 'individualCount', # column name of the response values for presence-absence data
                                  # trialsPA = 'trials',
                                   spatialCovariates = envCovsForModel, # raster of spatia covariates
                                   speciesName = 'Taxon', # Names of the species in the datasets
                                   pointsSpatial = NULL, # Do not include a dataset spatial field
                                   speciesSpatial = "replicate")   # unique spatial field per species, but they share the same hyperparameters.

# Add second bias field for PO data
model$addBias(datasetNames = 'mergedDatasetPO')

Priors

We assume the following priors for the pollinator ISDM:

  • The probability that the standard deviation of the pollinator spatial field is greater than \(0.1\) is \(0.1\) (i.e. \(P(\sigma_\omega > 0.1) = 0.1\)).

  • The probability that the standard deviation of the bias field is greater than \(0.1\) is \(0.1\) (i.e. \(P(\sigma_\eta > 0.1) = 0.1\))

  • The probability that the spatial range of the pollinator spatial field is greater than \(50\) is \(0.1\) (i.e \(P(\kappa_\omega < 50) = 0.1\))

  • The probability that the spatial range of the pollinator spatial field is greater than \(50\) is \(0.1\) (i.e. \(P(\kappa_\eta < 50) = 0.1\))

# hyper parameters of the spatial field (shared across species)
model$specifySpatial(Species = TRUE,  # define same prior for the all species
                     prior.sigma = c(0.6, 0.1),  # SD of field variation; 
                     prior.range = c(5, 0.1))  # range of spatial correlation; 


model$specifySpatial(Bias = TRUE, # Change the prior
                                 prior.sigma = c(1, 0.1),
                                 prior.range = c(5, 0.1))

model$specifyRandom(
  # precision parameter on how much each species' spatial field (how much they can deviate from the shared ___)
  speciesGroup = list(model = "iid", 
                      hyper = list(prec = list(prior = "pc.prec",
                                               param = c(0.1, 0.1)))), 
  # precision parameter on the baseline species occurrence rate
  speciesIntercepts = list(prior = 'pc.prec',
                           param = c(0.1, 0.1)))  

Fit model and make predictions

We fit the model and make predictions of the pollinator intensity within the study region.

modelRun <- PointedSDMs::fitISDM(data = model, 
                            options = modelOptions)
  
  # predictions for this model
  individualDatasetPreds <- predict(modelRun,
          data = ppxl,
          predictor = TRUE,
          n.samples = 100)

9.2. ISDM for plant species

Modelling all the \(54\) species required memoey allocations we did not have, so we split the species into groups with \(10\) species in each group. We fitted independent ISDMs for each of the groups.

# PointedSDMs takes a longer time to fit for many species
# The trick to to break it into smaller groups
# The split will be done by the number of present speccies in each location
# I will split it nGroups time
sortedSpecies <- table(asoDatasf$acceptedScientificName)%>%
  as.data.frame()%>%
  dplyr::arrange(desc(Freq))%>%
  select(Var1)%>%
  c()

# Set the number of groups
nGroups <- 10

#split the species into groups of 10 species in each
plantSpeciesGroup <- split(sortedSpecies$Var1, 
                           seq(1, 
                               ceiling(length(sortedSpecies$Var1)/nGroups)))

Similar to the model the pollinator ISDM, we fitted an ISDM to the \(54\) plant species. The ISDM has species-specific intercept, covariate effect and spatial field (but all the species share the same hyperparameters). We also added a spatial bias field to the observation model for the presence-only dataset.

speciesModelShared <- PointedSDMs::startSpecies(formatPlantData,
                                                Boundary = regionGeometry, 
                                   Projection = newCrs, 
                                   Mesh = meshForProject,
                                   responsePA = 'individualCount',
                                   speciesEnvironment = TRUE,
                                   speciesIntercept = TRUE,
                                   spatialCovariates = envCovsForPlantSpeciesModel, 
                                   speciesName = 'simpleScientificName',
                                   pointsIntercept = FALSE,
                                   pointsSpatial = NULL, # Do not include a dataset spatial field
                                   speciesSpatial = "replicate")  

# Add bias spatial field for the PO dataset
speciesModelShared$addBias(datasetNames = 'mergedPlantsPO')

Priors We assume the following priors for the pollinator ISDM:

  • The probability that the standard deviation of the pollinator spatial field is greater than \(0.1\) is \(0.1\) (i.e. \(P(\sigma_\omega > 0.1) = 0.1\)).

  • The probability that the standard deviation of the bias field is greater than \(0.1\) is \(0.1\) (i.e. \(P(\sigma_\eta > 0.1) = 0.1\))

  • The probability that the spatial range of the pollinator spatial field is greater than \(50\) is \(0.1\) (i.e \(P(\kappa_\omega < 50) = 0.1\))

  • The probability that the spatial range of the pollinator spatial field is greater than \(50\) is \(0.1\) (i.e. \(P(\kappa_\eta < 50) = 0.1\))

# hyper parameters of the spatial field (shared across species)
speciesModelShared$specifySpatial(Species = TRUE,  # define same prior for the all species
                     prior.sigma = c(1, 0.1),  
                     prior.range = c(5, 0.1))  


speciesModelShared$specifySpatial(Bias = TRUE, # Change the prior
                     prior.sigma = c(0.6, 0.1),
                     prior.range = c(5, 0.1))

# prior for random effects (mesh nodes of spatial field and species intercepts)
speciesModelShared$specifyRandom(
  # precision parameter on how much each species' spatial field (how much they can deviate from the shared ___)
  speciesGroup = list(model = "iid", 
                      hyper = list(prec = list(prior = "pc.prec",
                                               param = c(0.1, 0.1)))),  
  # precision parameter on the baseline species occurrence rate
  speciesIntercepts = list(prior = 'pc.prec',
                           param = c(0.1, 0.1)))  

Model fitting and predictions

# specify the model options for INLA
modelOptions <- list(control.inla = list(int.strategy = 'eb', 
                                         diagonal = 0.1, 
                                         cmin = 0), 
                     verbose = FALSE, 
                     safe = TRUE)

# Species-specific spatial effects model
speciesSharedEst <- PointedSDMs::fitISDM(data = speciesModelShared, 
                            options = modelOptions)

# I proceed with the prediction of occupancy probabilities
# over the entire region
individualDatasetPreds <- predict(speciesSharedEst,
                                  data = ppxl,
                                  predictor = TRUE,
                                  n.samples = 10)

9.3 Species Interractions

  load("../data/webPlot.RData")
  #interractionMatrix <- readr::read_csv(paste0(dataFolder,"/interractionsData/interractionMatrix.csv"))
  
    bipartite::plotweb(WebReady[[1]])

Webplot of plant species and pollinator interractions.

9.4. Diversity Indices

To define the diversity of the pollinators, we estimate the pollinator intensity given their interaction with the plants species within the location \(s\). This is estimated as: \[\begin{equation} \begin{split} \text{Intensity for intensity} &\propto \text{Estimated pollinator intensity} \times \text{Interraction weight} \times \text{plant species occurrence probability}\\ \implies \lambda^{\star}_i(s) &= \frac{\lambda_i(s) \times w_{ik} \times \Psi_k(s)}{\sum_k \lambda_i(s) \times w_k \times \Psi_k(s)}, \end{split} \end{equation}\] where \(w_k\) is the weight of the interaction between pollinator \(i\) and plant species \(k\), \(\Psi_k(s)\) is the occurrence probability of plant species \(k\) and \(\lambda_i(s)\) is the intensity of pollinator \(i\).

We estimated the Hill’s diversity indices for the pollinators. The Hill’s diversity indices are defined as: \[ H(s) = r_i(s) \times log(r_i(s)); \] where \(r_i(s) = \frac{\lambda^{\star}_i(s)}{\sum_{i} \lambda^{\star}_i(s)}\).

10. Results

10.1. Distribution of pollinators accross Norway

We present the predictions of log-intensity of the pollinators in Figure 5 and its standard error in Figure 6. The log-intensity for the three pollinators: bees, butterflies and hoverflies are similar with standard error close to \(0\). This result can be seen as a direct consequence of the estimates of climatic and trait variables on the pollinator distribution (Figure 7 and Figure 8).

Figure 5: Log intensity of pollinators (bees, butterfliesand hoverflies) across the study region from the fitted integrated species distribution model.
Figure 6: Standard error of the log-intensity of pollinators (bees, butterfliesand hoverflies) across the study region from the fitted integrated species distribution model.

The effects of traits and climatic variables on the pollinator distribution in Figure 7. There is a negative latitudinal effect on bees and butterflies, but a positive effect on hoverflies. However, there is a positive effect of annual temperature on butterflies and bees, but a negative effect on hoverflies. There is an insignificant trait effects on the hoverflies distribution as their credible interval of these covariates includes \(0\). The contrast of this statement is true for the bees and butterflies distribution.

Figure 7: Effects of traits and climatic variables on the pollinator distribution.

In contrast to the trait and climatic variables, there seems to be … (Figure 8).

Figure 8: Plant species land cover effects.
Figure 9: Plant species Predictions.
Figure 10: Diversity of pollinators across Norway.

11. Export file